Robust acoustic domain identification with its application to speaker diarization

Abstract

With the rise in multimedia content over the years, more variety is observed in the recording environments of audio. An audio processing system might benefit when it has a module to identify the acoustic domain at its front-end. In this paper, we demonstrate the idea of acoustic domain identification (ADI) for speaker diarization. For this, we first present a detailed study of the various domains of the third DIHARD challenge, highlighting the factors that differentiated them from each other. Our main contribution is to develop a simple and efficient solution for ADI. In this work, we explore speaker embeddings for this task. Next, we integrate the ADI module with the speaker diarization framework of the DIHARD III challenge. The performance substantially improved over that of the baseline when the thresholds for agglomerative hierarchical clustering were optimized according to the respective domains. We achieved a relative improvement of more than 5% and 8% in DER for the core and full conditions, respectively, on Track 1 of the DIHARD III evaluation set.
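
The domain-dependent clustering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the domain labels, the threshold values, the use of cosine distance with average linkage, and the function names (`ahc`, `diarize`) are all assumptions made for illustration. The sketch only shows how a predicted acoustic domain could select the stopping threshold for agglomerative hierarchical clustering over segment embeddings.

```python
import math

# Hypothetical per-domain stopping thresholds (cosine distance).
# The labels and values are illustrative assumptions, not figures from the paper.
DOMAIN_THRESHOLDS = {"audiobooks": 0.75, "meeting": 0.3, "restaurant": 0.1}
DEFAULT_THRESHOLD = 0.5  # assumed fallback when the domain is not in the table


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def ahc(embeddings, threshold):
    """Average-linkage AHC: merge the closest pair of clusters until the
    smallest inter-cluster distance exceeds the threshold."""
    clusters = [[i] for i in range(len(embeddings))]

    def linkage(c1, c2):
        dists = [cosine_distance(embeddings[i], embeddings[j])
                 for i in c1 for j in c2]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        dist, p, q = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # q > p, so index p is unaffected

    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def diarize(embeddings, predicted_domain):
    """Cluster segment embeddings with the threshold chosen for the domain."""
    threshold = DOMAIN_THRESHOLDS.get(predicted_domain, DEFAULT_THRESHOLD)
    return ahc(embeddings, threshold)


# Toy two-dimensional "embeddings" standing in for real speaker embeddings.
segments = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
for domain in ("audiobooks", "meeting", "restaurant"):
    print(domain, diarize(segments, domain))
```

With the same three toy segments, a looser threshold merges more segments into one speaker and a tighter one keeps them apart, which is why tuning the threshold per domain can change the resulting diarization.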

Similar resources

Robust Speaker Diarization for meetings

This thesis presents research on speaker diarization for meeting rooms. It looks into the algorithms and the implementation of an offline speaker segmentation and clustering system for meeting recordings, where usually more than one microphone is available. The main research and system implementation has been done while visiting the International Computer Science Institute...

Speaker Diarization in Meetings Domain

The purpose of this study is to develop robust techniques for speaker segmentation and clustering, with a focus on the meetings domain. The techniques examined can, however, be applied to other domains such as telephone and broadcast news. Traditional techniques for speaker diarization developed for telephone conversations or broadcast news are based on a single channel, which is notably different f...

Speaker Diarization Using a priori Acoustic Information

Speaker diarization is usually performed in a blind manner, without using a priori knowledge about the identity or acoustic characteristics of the participating speakers. In this paper, we propose a novel framework for incorporating available a priori knowledge, such as potential participating speakers, channels, background noise and gender, and integrating these knowledge sources into blind speak...

A sticky HDP-HMM with application to speaker diarization

We consider the problem of speaker diarization, the problem of segmenting an audio recording of a meeting into temporal segments corresponding to individual speakers. The problem is rendered particularly difficult by the fact that we are not allowed to assume knowledge of the number of people participating in the meeting. To address this problem, we take a Bayesian nonparametric approach to spe...

Robust Unsupervised Speaker Segmentation for Audio Diarization

Audio diarization (Reynolds & Carrasquillo, 2005) is the process of partitioning an input audio stream into homogeneous regions according to their specific audio sources. These sources can include audio type (speech, music, background noise, etc.), speaker identity and channel characteristics. With the continually increasing number of large volumes of spoken documents including broadcasts, voic...

Journal

Journal title: International Journal of Speech Technology

Year: 2022

ISSN: 1381-2416, 1572-8110

DOI: https://doi.org/10.1007/s10772-022-09990-9